Detailed description of triphone model using SSS-free algorithm

نویسندگان

  • Motoyuki Suzuki
  • Daisuke Honma
  • Akinori Ito
  • Shozo Makino
چکیده

The triphone model is frequently used as an acoustic model. It is effective for modeling phonetic variations caused by coarticulation. However, it is known that acoustic features of phonemes are also affected by other factors such as speaking style and speaking speed. In this paper, a new acoustic model is proposed. All training data which have the same phoneme context are automatically clustered into several clusters based on acoustic similarity, and a “sub-triphones” is trained using training data corresponding to a cluster. In experiments, the sub-triphone model achieved about 5% higher phoneme accuracy than the triphone model.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Shared-distribution hidden Markov models for speech recognition

Parameter sharing plays an important role in statistical modeling since training data are usually limited. On the one hand, we would like to use models that are as detailed as possible. On the other hand, with models too detailed, we can no longer reliably estimate the parameters. Triphone generalization may force two models to be merged together when only parts of the model output distribution...

متن کامل

Distinct triphone acoustic modeling using deep neural networks

To strike a balance between robust parameter estimation and detailed modeling, most automatic speech recognition systems are built using tied-state continuous density hidden Markov models (CDHMM). Consequently, states that are tied together in a tied-state are not distinguishable, introducing quantization errors inevitably. It has been shown that it is possible to model (almost) all distinct tr...

متن کامل

Voice Assimilation Phenomenon and Its Implementation in LVCSR System with Lexical Tree and Bigram Language Model

In this paper a LVCSR system with implementation of the Czech voice assimilation phenomenon is proposed. The recognition system uses lexical trees and a bigram language model. The first part of this article is focused on voice assimilation phenomenon description, triphone lexical tree construction, and voice assimilation impact on LVCSR system performance. The second part outlines lexical tree ...

متن کامل

A Successive State Splitting Algorithm Based on the Mdl Criterion by Data-driven and Decision Tree Clustering

We propose a new Successive State Splitting (SSS) algorithm based on the Minimum Description Length (MDL) criterion to design tied-state HMM topologies automatically. The SSS algorithm is a mechanism for creating both temporal and contextual variations based on the Maximum Likelihood (ML) criterion. However, it also needs to empirically predetermine control parameters for use as stop criteria, ...

متن کامل

Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM - model and training

We propose and evaluate a new acoustic model that combines HMM and a special type of the hidden dynamic model (HDM) – a target-directed hidden trajectory model – into a single integrated model named HTHMM. The new model provides a computational model of coarticulation by representing the internal dynamics of human speech based on the hidden trajectory of the vocal-tract resonances. This paper f...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009